Abstract

The evolution of animal colouration is driven in large part by sexual selection operating on traits used to transmit information to rivals and potential mates, which therefore have major impacts on fitness. Reflectance spectrometry has become a standard colour-measuring tool, especially after the discovery of tetrachromacy in birds and their ability to detect UV. Birds' plumage patterns may be invisible to humans, necessitating a reliable and objective way of assessing colouration that does not depend on human vision. Plumage colouration measurements can be taken directly on live birds in the field or in the lab (e.g. on collected feathers). It is therefore essential to determine which sampling method yields more repeatable and reliable measures, and which of the available quantitative approaches best assesses the repeatability of these measures. Using a spectrophotometer, we measured melanin-based colouration in barn swallow (Hirundo rustica) plumage. We assessed the repeatability of measures obtained with both traditional sampling methods separately to quantitatively determine their reliability. We used the ANOVA-based method to calculate the repeatability of measurements from two years separately, and the GLMM-based method to calculate overall adjusted repeatabilities for both years. We repeated the assessment for the whole reflectance spectrum range and for only the human-visible part, to assess the influence of the UV component on the reliability of the sampling methodologies. Our results reveal very high repeatability for lab measurements and lower, but still moderate to high, repeatability for field measurements. Both increased when limited to only the human-visible part, for all plumage patches except the throat, where we observed the opposite trend. Repeatability between sampling methods was quite low when including the whole spectrum, but moderate when including only the human-visible part. Our results suggest higher reliability for measurements in the lab and higher power and accuracy of the GLMM-based method. They also suggest UV reflectance differences amongst different plumage patches.

Key words: Adjusted repeatability, bird plumage, colourful displays, sexual selection, spectrophotometry, tetrachromacy, ultraviolet.

Introduction

Colour vision involves the capacity to discriminate amongst different wavelengths of light, independent of their intensity [1,2]. Although colouration traits expressed in animals have proven essential to understanding the nature of selection, sexual selection in particular, only relatively recently have scientists appreciated the importance of a systematic understanding of both the function and evolution of colouration, as well as the mechanisms that underpin it [3]. Birds in particular, due to their colourful displays and the role of their colour signals in fitness differentials, have traditionally been employed as prime model systems to understand the causes and implications of colour evolution. However, the mechanisms of colour vision and spectral information processing needed to understand how birds perceive colours remain areas with more questions than answers.

Mate choice theory predicts that elaborately ornamented males can provide female birds with direct fitness benefits (if ornamental traits reflect individual condition, useful individual attributes or somatic quality independent of condition) and/or indirect fitness benefits ('good genes' or attractiveness for offspring, as conspicuous and costly male traits indicate highly heritable viability) [4-6]. Therefore, birds with more elaborate colourful displays are expected to enjoy a selective advantage given their higher mating chances [4,7].

Given the paramount significance of studies of bird plumage colouration in behavioural and evolutionary ecology, methods for reliably and objectively quantifying such colouration are critical. Methods traditionally used for colour assessment include colour ranks on an arbitrary scale [8], tristimulus colour models based on human vision, such as the Munsell system [9,10], reference colour swatches [11], and digital photography [12-14]. However, although all these methods offer simple and affordable ways of measuring colour in different analytical settings, they lack reliability and objectivity [15], as they are tuned to the human visual system instead of the bird visual system.

Birds do not perceive colours in the same way as humans [16]. Birds have evolved a fourth cone type in their eyes, with a pigment that is sensitive to ultraviolet light. Although we are still far from understanding exactly how colours are perceived by birds [3], progress is being made towards understanding how colour vision works in general and how spectral information is processed by birds and other non-human animals [2,17,18]. Methods have been developed for comparing colour patterns as birds see them, using known properties of bird eyes and generating detailed formal descriptions of colour spaces and the equations used to plot them [19].

Since the 1990s, a wide range of further methods for analysing spectrophotometry data have emerged [3]. This development stems largely from the revival of interest in UV vision and tetrachromacy in birds and the fact that birds can see colours that humans cannot experience [20,21]. This raises the possibility that plumage patterns invisible to the human eye exist, and mate choice behaviours based on the ultraviolet part of the bird reflectance spectrum have indeed been discovered [22,23]. Therefore, reliable and objective ways of quantifying bird colouration, not dependent on human vision, are at a premium. Miniature diode-array spectroradiometer systems, which are lighter, more portable and more affordable than previous spectrometry systems but as precise and objective in colour quantification, have become popular tools for colour communication studies [3].

Two traditional ways of assessing bird plumage colouration with spectrophotometers have been reported in the literature. Measurements may be taken directly on the bird, applying the pointer of the spectrophotometer (the cone-shaped piece of black plastic on top of the probe, used to direct the light from the source at a given angle) to plumage patches as they occur in situ [24-29]. Alternatively, measurements may be taken in the lab on feather samples collected in the field, applying the pointer to "plumage patches" created by mounting these feathers on a flat surface in a way that mimics the original plumage structure [30-38]. Despite the popularity of spectrophotometers for colour assessment and the growing number of studies on bird colouration, few studies have rigorously assessed the consistency of the two methods for measuring the colouration of plumage patches, or the repeatability of results obtained when using one or the other (but see [39] for a comparison for carotenoid-based plumage colouration in great tits).

Additionally, there is little consensus on how best to quantify the reliability, or repeatability, of spectral measurements. The most common measure of repeatability, or more precisely the coefficient of intraclass correlation ($r_i$), can be formally defined as the proportion of the total variance explained by differences among groups [40,41]:

$$r_i = \frac{\sigma_{\alpha}^{2}}{\sigma_{\alpha}^{2} + \sigma_{\varepsilon}^{2}} \qquad \text{(eqn 1)}$$

where $\sigma_{\alpha}^{2}$ is the between-group variance and $\sigma_{\varepsilon}^{2}$ the within-group variance, and the sum of both comprises the total phenotypic variance [41]. Until recently, the most common ways to estimate repeatabilities from data with Gaussian errors have been the correlation-based method [40] and the ANOVA-based method, commonly used by behavioural and evolutionary ecologists [42,43]. However, Nakagawa and Schielzeth [41] developed an innovative R-based function for calculating GLMM-based repeatability estimates, which allows confounding variables to be factored out and calculates confidence intervals for each repeatability estimate, inferred from distributions of repeatabilities obtained by parametric bootstrapping.

Our aim is twofold. We first compare two different methods for measuring melanin-based plumage ornamentation to determine which one yields more repeatable and reliable measures. The methods consist of measuring feather colouration either directly on the bird in the field or on feather samples in the laboratory. We then compare two statistical methods to assess repeatability, one ANOVA-based and the other GLMM-based, and determine the pros and cons of each. We hypothesize that measuring feather colouration in the lab will yield more repeatable and reliable measures, as it avoids the logistic, technical and animal welfare limitations imposed by the field method and provides more controlled conditions during the measurements. We also predict more realistic and accurate repeatability estimates with the GLMM-based statistical method, as it allows us to introduce more sources of variation in each analysis. We use the European barn swallow Hirundo rustica as a model species. To the best of our knowledge, this is the first time GLMM-based repeatability estimates have been used to assess the reliability of melanin-based plumage colouration measurements.

Methods

Field work and data collection

Field work was carried out during May-August 2009 and March-August 2010 at multiple sites, mostly farmlands, around the Falmouth area in Cornwall, UK. A total of 59 adult European barn swallows (21 in 2009 and 38 in 2010) were caught using mist nets and ringed. We took morphometric measurements, quantified plumage reflectance spectra in the field and collected feather samples for subsequent assessment in the lab.

Colour was quantified based on Endler and Mielke's [19] approach, using a USB2000 spectrophotometer (Ocean Optics, Dunedin, Florida) and a xenon flash lamp (Ocean Optics). Before using the spectrophotometer, we calibrated it by setting the white and dark references, i.e., we defined which measurement should be treated as the 100% reflectance (white) standard and which as the 0% reflectance (dark) standard, so that all other colour measurements are expressed relative to those maximum and minimum possible reflectance values. We used a WS-1 SS Diffuse Reflectance Standard, a diffuse white plastic >98% reflective from 250-1500 nm, as the white reference (100% reflectance), and a piece of black velvet as the dark standard (0% reflectance) to correct for the noise recorded when no light reaches the sensor. At the far end of the reflection probe/light source, we fitted a non-reflective black pointer cut at a 45 degree angle, to avoid mismeasurement caused by white light reflected directly by the plumage reaching the sensor [15]. Using the spectra acquisition software package OOIBase (Ocean Optics), we measured the reflectance of four body regions of each bird, namely the throat, breast, belly and vent. For the measurements of feather samples in the lab, we collected enough feathers from live birds to be able to mount them one on top of the other and simulate the original pattern found on live birds. We mounted the feathers on a piece of black velvet to avoid background noise. Once we had tested the reliability of the measures obtained with each sampling method separately, we averaged the three measurements for each method and used these average values to test the comparability between field and lab measurements.
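The calibration and averaging steps described above can be illustrated with a minimal sketch. It assumes the usual two-point scaling, in which a raw reading is expressed relative to the dark (0%) and white (100%) standards at each wavelength; the function names and the array layout are hypothetical and are not taken from the acquisition software.

```python
import numpy as np

def percent_reflectance(sample, white, dark):
    """Scale raw sensor readings to % reflectance, wavelength by wavelength.

    Assumed two-point calibration: the dark standard maps to 0% and the
    white standard to 100%.
    """
    sample, white, dark = (np.asarray(a, dtype=float) for a in (sample, white, dark))
    span = white - dark
    if np.any(span <= 0):
        raise ValueError("white reference must exceed dark reference at every wavelength")
    return 100.0 * (sample - dark) / span

def mean_spectrum(scans):
    """Average repeated scans of one patch (one scan per row) into one spectrum."""
    return np.mean(np.asarray(scans, dtype=float), axis=0)
```

In this sketch, three calibrated scans of the same patch would be stacked as rows and passed to `mean_spectrum` to obtain the per-method average used for the field versus lab comparison.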

We also used Endler and Mielke's [19] method to calculate brightness, chroma and hue, the parameters generally used to specify a colour. Using their equations and the mathematical software Matlab (The MathWorks Inc., Natick, MA), we obtained the spectral sensitivity functions of the cones corrected for the cut points of oil droplets, calculated the quantal catch for each photoreceptor and converted those quantal catches into colour space coordinates in a tetrahedral colour space (Fig. 1). Chroma is defined as the strength of the colour signal, or the degree of difference in stimulation among the cones, and it is proportional to the Euclidean distance from the origin, that is, the distance from the bird grey (achromatic) point to each point, specified by three space coordinates. Perception of hue depends on which cones are stimulated, and in tetrahedral colour space it is defined by the angle that a point makes with the origin. As bird colour space is 3D, hue is defined by two angles, analogous to latitude and longitude in geography [19].
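As a rough illustration of these definitions, the sketch below converts a point in a three-dimensional colour space, with the achromatic point at the origin, into a chroma value and two hue angles. The axis convention (longitude measured in the x-y plane, latitude as elevation towards the z axis) is an assumption made for the example and is not necessarily the convention of [19]; chroma is returned as the raw Euclidean distance, without any proportionality constant.

```python
import math

def chroma_and_hue(x, y, z):
    """Return (chroma, hue_longitude, hue_latitude) for a colour space point.

    Angles are in radians. Hue is undefined at the achromatic origin,
    so both angles are returned as None there.
    """
    chroma = math.sqrt(x * x + y * y + z * z)
    if chroma == 0.0:
        return 0.0, None, None
    longitude = math.atan2(y, x)              # angle within the x-y plane
    ratio = max(-1.0, min(1.0, z / chroma))   # guard against rounding error
    latitude = math.asin(ratio)               # elevation above the x-y plane
    return chroma, longitude, latitude
```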

Brightness is defined as the summed mean reflectance across the entire spectral range (R300-700; [44,45]). In addition to these parameters, we included UV chroma, a measure of spectral purity, in our analysis. It was calculated as the proportion of reflectance in the UV part of the spectrum (R300-400) in relation to the total reflectance spectrum (R300-700; [46]). Cone sensitivities and oil droplet cut points were taken from Bowmaker et al. [16], Hart [47], Vorobyev et al. [48], Govardovskii et al. [49], and Hart and Vorobyev [50].
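A minimal sketch of these two spectral summaries follows. It makes two assumptions that the text does not settle: brightness is taken here as the mean reflectance over the chosen range, and both band limits are treated as inclusive, so a reading at exactly 400 nm counts towards the UV band. The function names are illustrative only.

```python
import numpy as np

def brightness(wavelengths, reflectance, lo=300.0, hi=700.0):
    """Mean reflectance between lo and hi nm (inclusive limits, assumed)."""
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    band = (w >= lo) & (w <= hi)
    return float(r[band].mean())

def uv_chroma(wavelengths, reflectance, uv=(300.0, 400.0), full=(300.0, 700.0)):
    """Proportion of total reflectance (R300-700) that falls in the UV band (R300-400)."""
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    uv_sum = r[(w >= uv[0]) & (w <= uv[1])].sum()
    total = r[(w >= full[0]) & (w <= full[1])].sum()
    return float(uv_sum / total)
```

Restricting the analysis to the human-visible part corresponds, in this sketch, to calling `brightness(w, r, lo=400.0)`.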

Although all the avian families investigated show plumages reflecting significant amounts of UV light (see [51] for a review), in the particular case of barn swallows, ventral plumage shows a noisy reflectance pattern in the UV part of the spectrum and does not exhibit a clear ultraviolet reflectance peak (Fig. 2; [34]). For this reason, we calculated the same colour variables both including and excluding the UV part of the reflectance spectrum, and carried out a repeatability assessment using either the whole reflectance spectrum (300-700 nm) or only the human-visible part (400-700 nm). When using only the visible part, we did not include UV chroma, for obvious reasons, nor hue, as its values are identical in both cases.

Due to the way data were collected, the three plumage colouration measurements taken in the field in 2009 covered a wider area of each plumage patch than the measurements made on feather samples, which were restricted to the area covered by the bunch of feathers plucked from each patch on each individual. In 2010, however, the three field measurements were taken in approximately the same plumage area for each patch.

ANOVA-based method

To test the reliability of each sampling procedure (described above) separately, we calculated the repeatability of the colour variables in the four patches for each procedure according to Lessells and Boag [43], Senar [52] and Quesada and Senar [39]. Repeatabilities were computed from the mean squares of an ANOVA on three repeated measures per individual. In both field and lab procedures, the second and third measurements were made after removing the reflection probe/light source, with the black pointer on top of it, and placing it again on the colouration patch. IV took all the measurements.

Once we had calculated the "within-method" repeatabilities, i.e., the repeatabilities for each sampling method, we averaged the three measurements per patch per individual and assessed the "between-method" repeatability, i.e., the repeatability of measurements across procedures. This time the ANOVA was carried out on two repeated measures per individual, one from the field and one from the lab. We repeated this process for the 2009 and 2010 data separately.
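The mean-squares calculation can be sketched as follows, assuming the standard one-way ANOVA estimator associated with Lessells and Boag [43]: the among-individual variance component is (MS_among - MS_within) / n0, and repeatability is that component divided by its sum with MS_within. The function names are illustrative, and this is not the authors' own script.

```python
import numpy as np

def anova_repeatability(groups):
    """Repeatability and F ratio from repeated measures grouped by individual.

    `groups` is a list of sequences, one per individual. Unequal group
    sizes are handled through the n0 coefficient.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    a = len(groups)                                  # number of individuals
    sizes = np.array([g.size for g in groups])
    n_total = sizes.sum()
    grand_mean = np.concatenate(groups).mean()
    ss_among = sum(n * (g.mean() - grand_mean) ** 2 for n, g in zip(sizes, groups))
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (n_total - a)
    n0 = (n_total - (sizes ** 2).sum() / n_total) / (a - 1)
    var_among = (ms_among - ms_within) / n0          # can be negative
    r = var_among / (ms_within + var_among)
    return float(r), float(ms_among / ms_within)

def repeatability_from_f(f_ratio, n0):
    """Equivalent form of the same estimator: r = (F - 1) / (F - 1 + n0)."""
    return (f_ratio - 1.0) / (f_ratio - 1.0 + n0)
```

The second function makes the link between F and repeatability explicit. With three measures per individual, an F ratio of 3.157 corresponds to a repeatability of about 0.418, and with two measures per individual any F ratio below 1 gives a negative repeatability.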

GLMM-based method

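We used a modified version of the R function R.Anson, which is itself a modification of the rpt.remlLMM function [41]. We fitted two random-effect terms (individual identity and year) in our linear mixed-effects models, and calculated the adjusted repeatability estimate as:

$$r_i = \frac{\sigma_{\alpha}^{2}}{\sigma_{\alpha}^{2} + \sigma_{\varepsilon}^{2} + \sigma_{r}^{2}} \qquad \text{(eqn 2)}$$

where $\sigma_{\alpha}^{2}$ and $\sigma_{\varepsilon}^{2}$ are the between-group and within-group variances of eqn 1, and $\sigma_{r}^{2}$ is the year variance.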
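The arithmetic of eqn 2 can be illustrated with a short sketch. The variance components would in practice come from a fitted mixed-effects model, which is not reproduced here; the function name and the numbers in the usage note are hypothetical.

```python
def adjusted_repeatability(var_individual, var_residual, var_year=0.0):
    """Among-individual variance as a share of the total, as in eqn 2.

    With var_year left at 0 the expression reduces to eqn 1.
    """
    components = (var_individual, var_residual, var_year)
    if any(v < 0 for v in components):
        raise ValueError("variance components must be non-negative")
    total = sum(components)
    if total == 0:
        raise ValueError("total variance must be positive")
    return var_individual / total
```

For example, hypothetical components of 6.0 (individual), 2.0 (residual) and 2.0 (year) give a repeatability of 0.6, whereas leaving the year term out of the denominator would give 0.75.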

To obtain a general idea of repeatability for each patch, we included all the colour variables in a principal component analysis (PCA) and calculated the repeatability (and confidence intervals) for the first component (PC1) within each plumage patch.

As we conducted multiple statistical tests on data subsets that are not likely to be biologically independent of each other (i.e. different components of the spectra, the same metrics in different years, or the same metrics in the lab and in the field), there was an increased probability of type I errors. To control for this, we corrected our p-values for multiple testing using the sequentially rejective Bonferroni procedure of Holm [53], implemented in the p.adjust function from the stats package in R [54]. All statistical analyses were carried out using R [54,55].
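For readers unfamiliar with Holm's sequentially rejective procedure, the adjustment can be sketched in a few lines. This is an illustrative Python re-implementation of the standard algorithm, not the R code used in the study: the p-values are sorted in ascending order, the k-th smallest (counting from zero) is multiplied by m - k, monotonicity is enforced with a running maximum, and the result is capped at 1.

```python
def holm_adjust(p_values):
    """Holm-adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)   # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted
```

Because the multiplier grows with the number of tests, an analysis that needs fewer tests is penalised less by the correction.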

Ethics Statement

AMcG and MRE held a Home Office licence which covered taking feather samples as well as other activities (Home Office Project Licence Number 30/2740). MRE was the project licence holder.

All work was carried out on private residences and farms with the express permission of the landowners in question. Contact details of the landowners can be provided by the authors upon request, after asking the landowners for permission, in order to respect their privacy.

Our field study did not involve any endangered or protected species. The specific locations of the study are provided as supporting material.

Birds were caught using mist nets under licence (AMcG BTO A licence Holder No. 4947).

Results

ANOVA analyses

In 2009, when including the whole spectrum in the analyses, measuring plumage colouration in the lab proved to be a reliable method: brightness, UV chroma, chroma, and hue latitude and longitude were highly repeatable for all the patches, with only hue latitude in the throat being less repeatable ($r_i = 0.418$, $F_{21,44} = 3.157$, $P = 0.01$; Table 1). Measuring plumage colouration in the field (at three different points, covering a wider area of each patch) was also quite consistent, with overall lower but still reasonably high repeatability values for all the variables and patches. Repeatability was especially low for hue latitude in the breast ($r_i = 0.382$, $F_{20,42} = 2.856$, $P = 0.025$) and vent ($r_i = 0.394$, $F_{21,44} = 2.955$, $P = 0.017$; Table 1).

When including only the visible part of the spectrum in the analysis, overall repeatability declined. The lab method again proved to be the more reliable, with high values of repeatability for brightness and chroma in all the patches. The field procedure was moderately repeatable for belly and throat, but showed low repeatability for brightness in the breast ($r_i = 0.36$, $F_{20,42} = 2.6885$, $P = 0.038$) and for chroma in the vent ($r_i = 0.428$, $F_{21,44} = 3.241$, $P = 0.007$; Table 2).

The values of repeatability across the field and laboratory procedures were very low for all the patches measured, whether considering the whole reflectance spectrum or only the visible part ($r_i < 0.35$ and $P > 0.05$ in all cases), suggesting a lack of consistency across the two assessment methods for melanin-based plumage colouration. Repeatabilities of brightness measurements were slightly higher for the whole spectrum than for only the visible part in all the patches except the vent. However, for these between-method comparisons, including only the visible part of the spectrum yielded much more repeatable chroma values than including the whole range, sometimes even turning negative repeatability values into positive ones, e.g. for the belly (whole range: $r_i = -0.507$, $F_{20,21} = 0.327$, $P = 1$; only visible range: $r_i = 0.346$, $F_{20,21} = 2.056$, $P = 0.599$) or the vent (whole range: $r_i = -0.758$, $F_{21,22} = 0.138$, $P = 1$; only visible range: $r_i = 0.296$, $F_{21,22} = 1.842$, $P = 0.657$; Table 1 and Table 2).

In 2010, when including the whole spectrum in the analyses, field measurements (taken at approximately the same point within each patch) yielded considerably higher repeatabilities than in 2009, with all the $r_i$ values above 0.60 except for hue latitude in the throat ($r_i = 0.515$, $F_{37,75} = 4.186$, $P < 0.0001$), and with most of the values ranging from 0.74 to 0.91, except for brightness in the breast ($r_i = 0.611$, $F_{37,76} = 5.710$, $P < 0.0001$), hue latitude in the belly ($r_i = 0.63$, $F_{37,76} = 6.1$, $P < 0.0001$), hue latitude in the vent ($r_i = 0.629$, $F_{37,76} = 6.088$, $P < 0.0001$) and hue longitude in the vent ($r_i = 0.679$, $F_{37,76} = 7.356$, $P < 0.0001$). In the lab, repeatability values ranged from 0.71 to 0.95 in most of the patches, except for hue latitude in the throat ($r_i = 0.65$, $F_{37,76} = 6.569$, $P < 0.0001$), and repeatability was overall higher than when measuring on live birds, except for UV chroma in the belly ($r_i = 0.857$, $F_{37,76} = 18.986$, $P < 0.0001$), breast ($r_i = 0.788$, $F_{37,76} = 12.117$, $P < 0.0001$) and vent ($r_i = 0.819$, $F_{37,76} = 14.571$, $P < 0.0001$) and hue latitude in the breast ($r_i = 0.722$, $F_{37,76} = 8.808$, $P < 0.0001$), where it was slightly lower. Repeatability values were similar to the ones obtained in 2009 (Table 3).

When including only the visible part of the spectrum in the analysis, measuring colouration in the lab was again the more reliable of the two methods, with all the $r_i$ values above 0.91 except for chroma in the throat ($r_i = 0.881$, $F_{37,76} = 22.964$, $P < 0.0001$). The field procedure still yielded high repeatability values, with brightness in the breast ($r_i = 0.569$, $F_{37,76} = 4.959$, $P < 0.0001$) and chroma in the vent ($r_i = 0.696$, $F_{37,76} = 7.839$, $P < 0.0001$) being the only measurements with values below 0.81. For both methods, repeatability values were higher than in 2009 for all the variables within all the plumage patches (Table 4).

Repeatabilities across field and lab methods in 2010 were quite heterogeneous when including the whole spectrum in the analyses: moderately high for hue longitude in the belly ($r_i = 0.794$, $F_{37,38} = 8.732$, $P < 0.0001$) and breast ($r_i = 0.657$, $F_{37,38} = 4.818$, $P < 0.0001$), moderate for vent hue latitude ($r_i = 0.463$, $F_{37,38} = 2.723$, $P = 0.018$) and longitude ($r_i = 0.561$, $F_{37,38} = 3.553$, $P = 0.001$), belly hue latitude ($r_i = 0.431$, $F_{37,38} = 2.515$, $P = 0.034$) and breast brightness ($r_i = 0.482$, $F_{37,38} = 2.861$, $P = 0.012$), and low for breast chroma ($r_i = 0.326$, $F_{37,38} = 1.966$, $P = 0.21$), throat UV chroma ($r_i = 0.321$, $F_{37,38} = 1.944$, $P = 0.312$) and chroma ($r_i = 0.274$, $F_{37,38} = 1.755$, $P = 0.576$) and vent brightness ($r_i = 0.349$, $F_{37,38} = 2.070$, $P = 0.141$). For the rest of the cases, repeatabilities were very low ($r_i < 0.23$ and $P > 0.05$ in all cases). When including only the visible part of the spectrum, repeatability was moderate to high and significant for both brightness and chroma in the breast, and for chroma in the vent and in the belly, whereas it was quite low and non-significant for brightness in the belly ($r_i = 0.217$, $F_{37,38} = 1.556$, $P = 0.899$), and very low for both variables in the throat. Repeatability values for chroma in all the patches except the throat were much higher than when we included the whole spectrum range, e.g. in the belly (whole range: $r_i = -0.049$, $F_{37,38} = 0.906$, $P = 1$; only visible range: $r_i = 0.75$, $F_{37,38} = 7.005$, $P < 0.0001$) and in the vent (whole range: $r_i = 0.224$, $F_{37,38} = 1.579$, $P = 0.657$; only visible range: $r_i = 0.416$, $F_{37,38} = 2.428$, $P = 0.047$; Table 3 and Table 4).

Generalized Linear Mixed Model analyses

When we included the whole spectrum in the analyses, PC1 accounted for more than 53% of the total variance in all the principal component analyses carried out within each patch for field and lab measurements, except for vent measurements in the field (where it explained 49% of the total variance) and for measurements in the throat (where it explained between 44% and 47%). When including only the visible part of the spectrum, PC1 explained 67% of the total variance for vent measurements in the field, and between 75% and 93% in the rest of the cases.

Repeatability of feather colour measurements was much higher, and confidence intervals narrower, for measurements taken in the lab than for measurements taken on live birds in the field, whether including the whole spectrum in the analyses or only the visible part. It was particularly high in the breast (whole range: $r_i = 0.916$, 95% CI = 0.855, 0.943, $P < 0.0001$; visible range: $r_i = 0.927$, 95% CI = 0.872, 0.95, $P < 0.0001$). All the repeatability values from lab measurements ranged between 0.71 and 0.93 and were highly significant ($P < 0.0001$).

When using field measurements, repeatabilities were still moderately high (all $r_i$ values above 0.50) and higher when including only the visible spectrum range in the analyses than when including the whole range, except in the throat (whole range: $r_i = 0.629$, 95% CI = 0.303, 0.874, $P < 0.0001$; visible range: $r_i = 0.564$, 95% CI = 0.266, 0.816, $P < 0.0001$; Fig. 3).

Repeatabilities across field and lab methods were higher when we included only the visible spectrum in the analyses than when we included the whole spectrum. Leaving aside the values for the throat, where repeatability was not significantly different from zero regardless of the spectrum range included in the analyses, $r_i$ ranged between 0.19 and 0.41 when including the whole spectrum, and between 0.44 and 0.62 when including only the visible part (Fig. 3). All the repeatabilities in belly, breast and vent were significant except for the belly when including the whole spectrum ($r_i = 0.189$, 95% CI = 0, 0.415, $P = 0.069$), which became higher and significant when only the visible spectrum was included in the analyses ($r_i = 0.503$, 95% CI = 0.281, 0.667, $P < 0.0001$; Fig. 3).

Discussion

Measuring plumage ornamentation to gain repeatable and reliable measures

Measuring plumage colouration from feather samples in the lab proved to be a highly reliable method, with generally high values of repeatability for all the variables and patches in 2009 and 2010, and when applying the GLMM-based method to both years. Measurements taken directly on birds in the field showed reasonable reliability, but with overall lower values of repeatability than in the lab for most of the variables measured on the different patches. As exceptions, some variables in the throat in 2009, and UV chroma measurements in the belly, breast and vent in 2010 when considering the whole spectrum, yielded higher values of repeatability when measured in the field, but only when applying the ANOVA-based method.

A potential explanation for these exceptions is that the throat patch is smaller and much darker than the rest of the patches. The feathers of the throat patch are also considerably smaller. It is therefore often quite difficult to obtain a reliable reflectance measurement with such a limited number of photons reaching the spectrophotometer probe. It is also more difficult to create a "plumage patch" in the lab with a feather arrangement similar to the bird's original one and big enough for the spectrophotometer pointer to be applied to it. An alternative explanation is that the UV part of the spectrum shows a highly noisy pattern, so highly consistent UV chroma repeatabilities across field or lab measurements may not necessarily be expected. This latter alternative may explain why repeatability values for UV chroma measurements were higher in the field than in the lab in 2010, whereas the rest of the repeatability values consistently tended to be higher in the lab.

When applying the GLMM-based method, repeatability was moderate to high within all the patches for field measurements, whereas it was considerably higher, and the confidence intervals considerably narrower, for lab measurements. This finding suggests that lab-based measures are a more reliable way of assessing melanin-based colouration.

When comparing the two methods, the values obtained in 2009 for different variables measured in different plumage patches directly on birds in the field were poorly repeatable with respect to the values obtained for the same variables measured from feather samples in the lab, and non-significant in all cases. In 2010, in contrast, repeatabilities were higher, but significant only for certain metrics in certain patches. These results stand in marked contrast to the positive results of another study, which compared the repeatabilities between both colouration assessment procedures for carotenoid-based plumage [39]. There may be several reasons for this difference. For example, due to the different characteristics of the two types of pigments, carotenoid-derived colouration is more variable among individuals than melanin-based colouration [56], and repeatability of a character increases with variability [52]. A possible way to increase the repeatability of some measurements would be to increase the number of measurements, for example from three to five, as several authors have already done [22,28,32]. Unfortunately, this is not an option when working with live birds in the field, as it would increase handling times and, consequently, stress levels to an unacceptable degree, although it can be applied when assessing colouration in the lab on feather samples [39].

As mentioned above, the three plumage colouration measurements taken in the field in 2009 covered a wider area of each plumage patch than in 2010, when the three field measurements were taken in approximately the same plumage area for each patch, which was also the area from which the majority of feathers for lab measurements were taken. Due to the different ways in which data were collected in the field each year, the repeatability of 2009 field measurements can be taken as an estimate of colouration consistency within the plumage patches. Our results suggest a moderate to high within-patch consistency for melanin-based ventral colouration in our model species. The comparability of both procedures in 2009 may have been compromised, although the repeatability of the 2010 samples was higher even for lab measurements, especially when considering only the visible part of the spectrum. This may be indicative of higher patch colouration homogeneity in 2010. The higher lab than field repeatabilities in 2009 could also be argued to be a function of field measures being taken over bigger plumage areas. However, the fact that lab repeatabilities were still higher than field repeatabilities in 2010 suggests practical limitations to the repeatability of measurements in the field.

Consequently, according to our results, collecting feathers from birds and assessing their colouration in the lab is a more reliable method for assessing melanin-based plumage colouration than measuring it directly on live birds, as well as being more convenient, minimising risk to a sensitive device such as a spectrophotometer, and reducing handling time of the animals [39].

Assessing repeatability

The GLMM-based method [41], applied to data from both years, allowed us to control for year effects by including the year variance in the total variance calculation, so that we could obtain adjusted repeatabilities for data from both years. The PCA allowed us to create composite variables accounting for almost 50% of the total variance in the metrics taken from each patch, which made it possible to estimate the overall repeatability within each patch for each method separately and across methods.
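
The role of the year term can be sketched from variance components. This is a minimal sketch, assuming (as the description above suggests) that the year variance enters the denominator together with the among-individual and residual variances. The function name and the component values are hypothetical and are not taken from the fitted models.

```python
def adjusted_repeatability(var_individual: float,
                           var_year: float,
                           var_residual: float) -> float:
    """Share of total variance attributable to differences among individuals.

    Assumption: total variance = individual + year + residual components,
    as estimated from a mixed model with individual and year as random factors.
    """
    total = var_individual + var_year + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_individual / total


# Hypothetical variance components, for illustration only.
with_year = adjusted_repeatability(6.0, 1.0, 3.0)
without_year = adjusted_repeatability(6.0, 0.0, 3.0)
print(round(with_year, 3), round(without_year, 3))
```

Comparing the two calls shows how a non-zero year component changes the estimate relative to a calculation that ignores year altogether.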

The possibility of calculating adjusted repeatabilities by including year as a random factor, together with the reduction achieved by the PCA in the number of variables while retaining a large proportion of the total variance, considerably reduced the number of tests needed for repeatability calculation. Thus, the p-values obtained with this method were less affected by Bonferroni corrections than those obtained with the ANOVA-based method, reducing the probability of type II errors and increasing the power of this repeatability-calculation method.
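
The link between the number of tests and the Bonferroni penalty can be illustrated with a short sketch. The test counts (4 and 20) and the raw p-value are hypothetical, chosen only to show the direction of the effect; they are not the numbers of tests run in this study.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value: raw p multiplied by the number of tests, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Hypothetical raw p-value and test counts, for illustration only.
raw_p = 0.01
for n_tests in (4, 20):
    adjusted = bonferroni_adjust(raw_p, n_tests)
    print(n_tests, adjusted, adjusted < 0.05)
```

The same raw p-value remains below 0.05 after correction for the smaller family of tests but not for the larger one, which is the sense in which fewer tests preserve power.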

Additionally, this reduction in the number of variables gave us a general view of colour measurement repeatability within each patch. As all our variables are derived from the reflectance spectra obtained from each colouration patch with the spectrophotometer, comparing the repeatability of all the different variables within patches can be redundant, as well as increasing the probability of type II errors. The PCA-GLMM combination, in contrast, made it possible to compare the first principal components within each patch, which can be regarded as whole-spectrum estimates because they capture a large proportion of the total variance contained in all the variables originally extracted from the reflectance spectra. It also allowed us to do so with data from both years.

Finally, the GLMM-based method allowed us to calculate confidence intervals, which are useful indicators of the reliability of our repeatability estimates, in addition to p-values. Together with the advantages mentioned above, this made it possible to obtain a more complete and reliable overall perspective of the question being studied.

The fact that almost all the repeatability estimates, and especially the repeatabilities across field and lab methods (in patches other than the throat), were higher when only the human-visible spectrum was included in the analyses suggests that the noisy reflectance pattern in the UV part of the spectrum may be decreasing repeatability and leading us to underestimate the comparability of the two methods. For throat plumage, however, we observed the opposite trend, with higher repeatability values when including the whole spectrum, which could indicate that the UV part of the spectrum is more important in the throat than in the other patches. The reflectance spectra plots for the different patches (Fig. 2) support this idea: throat reflectance spectra, although also quite noisy in the UV part, tend to show UV reflectance peaks in both sexes, unlike the spectra of the other patches. Further work is needed to find out whether UV reflectance differs among plumage patches.

Conclusions

Our results suggest that collecting feathers from live animals and assessing colouration in the lab yields more repeatable and reliable measurements of plumage ornamentation than direct measurements on live birds in the field. In addition, since it is easier on equipment and minimises the time birds need to be handled (and thus the stress inflicted on them), feather sampling would appear to be the best method available.

From a statistical point of view, our results support the superiority of the GLMM-based method [41] for repeatability calculation, as it enables random factors to be accounted for and yields adjusted repeatability values, which are more accurate than those calculated using other (e.g., ANOVA-based) methods, while increasing the power of the tests. The reduction in the number of variables gives a general, patch-by-patch overview of the problem being studied, and the confidence intervals allow us to assess the reliability of our own repeatability estimates.

Finally, we have also shown that it is important to check the effect that the UV part of the spectrum may have on repeatability calculations, as the capacity of the plumage to reflect UV light could have different biological implications in different plumage patches.

Table 1: ANOVA-derived repeatabilities in 2009 plumage colouration measurements taken from live birds in the field, feather samples in the lab, and across both procedures (UV+Visible spectrum).

UV+Visible range      Belly               Breast              Throat              Vent
                      F        r          F        r          F        r          F        r
Repeatability field
  Brightness          6.0446   0.627***   3.1183   0.414*     12.273   0.79***    6.1868   0.634***
  UV Chroma           10.203   0.754***   14.171   0.814***   13.861   0.811***   5.3196   0.59***
  Chroma              9.0161   0.728***   11.238   0.773***   10.714   0.764***   4.5611   0.543***
  Hue latitude        4.0921   0.508***   2.8564   0.382*     4.4146   0.532***   2.9545   0.394*
  Hue longitude       13.025   0.8***     9.1422   0.731***   24.348   0.886***   4.6771   0.55***
Repeatability lab
  Brightness          13.188   0.802***   14.777   0.821***   8.2092   0.706***   20.212   0.865***
  UV Chroma           46.493   0.938***   23.015   0.88***    34.895   0.919***   25.561   0.892***
  Chroma              42.489   0.932***   26.975   0.896***   62.481   0.954***   29.052   0.903***
  Hue latitude        11.986   0.785***   6.5101   0.647***   3.1573   0.418**    23.024   0.88***
  Hue longitude       25.986   0.893***   9.2833   0.734***   6.0119   0.625***   27.291   0.898***
Comparison field-lab
  Brightness          1.4636   0.188      2.0702   0.349      0.865    -0.072     1.8043   0.287
  UV Chroma           1.3153   0.136      1.3591   0.152      0.7499   -0.143     0.9006   -0.052
  Chroma              0.327    -0.507     1.3715   0.157      1.0143   0.007      0.1376   -0.758
  Hue latitude        0.9084   -0.048     0.8635   -0.073     1.2523   0.112      0.706    -0.172
  Hue longitude       1.7364   0.269      1.7513   0.273      0.6709   -0.197     1.1055   0.05

'***' P<0.001; '**' P<0.01; '*' P<0.05; '.' P<0.1

Table 2: ANOVA-derived repeatabilities in 2009 plumage colouration measurements taken from live birds in the field, feather samples in the lab, and across both procedures (only the visible part of the spectrum).

Visible range         Belly               Breast              Throat              Vent
                      F        r          F        r          F        r          F        r
Repeatability field
  Brightness          4.8958   0.565***   2.6885   0.36*      11.221   0.773***   5.6072   0.606***
  Chroma              6.7724   0.658***   6.4868   0.646***   6.6207   0.652***   3.2416   0.428**
Repeatability lab
  Brightness          11.818   0.783***   15.892   0.832***   6.9936   0.666***   19.326   0.859***
  Chroma              28.865   0.903***   21.117   0.821***   29.402   0.873***   23.838   0.901***
Comparison field-lab
  Brightness          1.461    0.187      2.0097   0.335      0.6804   -0.19      1.8902   0.308
  Chroma              2.0563   0.346      1.6683   0.25       1.1396   0.065      1.8415   0.296

'***' P<0.001; '**' P<0.01; '*' P<0.05; '.' P<0.1

Table 3: ANOVA-derived repeatabilities in 2010 plumage colouration measurements taken from live birds in the field, feather samples in the lab, and across both procedures (UV+Visible spectrum).

UV+Visible range      Belly               Breast              Throat              Vent
                      F        r          F        r          F        r          F        r
Repeatability field
  Brightness          16.422   0.837***   5.7104   0.611***   17.398   0.846***   14.534   0.819***
  UV Chroma           25.333   0.89***    15.853   0.832***   12.357   0.791***   16.829   0.841***
  Chroma              24.037   0.885***   14.749   0.821***   22.212   0.876***   12.377   0.792***
  Hue latitude        6.0997   0.630***   9.7411   0.744***   4.1857   0.515***   6.0878   0.629***
  Hue longitude       23.669   0.883***   31.363   0.910***   9.8914   0.748***   7.3556   0.679***
Repeatability lab
  Brightness          55.197   0.947***   31.387   0.910***   30.9     0.909***   44.036   0.934***
  UV Chroma           18.986   0.857***   12.117   0.788***   17.544   0.847***   14.571   0.819***
  Chroma              25.375   0.890***   25.936   0.893***   27.679   0.899***   39.854   0.928***
  Hue latitude        8.3908   0.711***   8.8076   0.722***   6.5692   0.650***   9.1422   0.731***
  Hue longitude       37.357   0.924***   31.683   0.911***   11.664   0.781***   37.517   0.924***
Comparison field-lab
  Brightness          1.3041   0.131948   2.8612   0.482*     0.7642   -0.134     2.0703   0.349
  UV Chroma           0.5572   -0.284     1.059    0.029      1.9441   0.321      1.5873   0.227
  Chroma              0.9059   -0.049     1.9662   0.326      1.755    0.274      1.5788   0.224
  Hue latitude        2.5152   0.431*     1.2479   0.110243   0.7839   -0.121     2.7232   0.463*
  Hue longitude       8.7322   0.794***   4.8175   0.657***   0.9731   -0.014     3.5525   0.561**

'***' P<0.001; '**' P<0.01; '*' P<0.05; '.' P<0.1

Table 4: ANOVA-derived repeatabilities in 2010 plumage colouration measurements taken from live birds in the field, feather samples in the lab, and across both procedures (only the visible part of the spectrum).

Visible range         Belly               Breast              Throat              Vent
                      F        r          F        r          F        r          F        r
Repeatability field
  Brightness          14.408   0.817***   4.959    0.569***   16.59    0.839***   13.527   0.806***
  Chroma              22.566   0.878***   22.163   0.876***   20.007   0.864***   7.8386   0.696***
Repeatability lab
  Brightness          59.713   0.951***   34.778   0.918***   34.133   0.917***   49.55    0.942***
  Chroma              51.455   0.944***   57.897   0.950***   22.964   0.881***   72.314   0.959***
Comparison field-lab
  Brightness          1.5559   0.217      3.1819   0.522**    0.679    -0.191     2.3696   0.406.
  Chroma              7.0047   0.750***   7.0261   0.751***   1.0876   0.042      2.4275   0.416*

'***' P<0.001; '**' P<0.01; '*' P<0.05; '.' P<0.1
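
The F ratios and repeatabilities tabulated above are linked by the standard ANOVA-based intraclass correlation, r = (F - 1) / (F - 1 + n0), where n0 is the number of measurements per individual. The sketch below is a reader's check rather than the analysis code behind the tables. The function name is hypothetical, and the values n0 = 3 (repeated measurements within a method) and n0 = 2 (field versus lab) are assumptions that reproduce the Table 1 belly brightness entries; other entries may reflect a different n0, for example with unbalanced samples.

```python
def repeatability_from_f(f_ratio: float, n0: float) -> float:
    """ANOVA-based repeatability from the F ratio of a one-way ANOVA.

    r = (F - 1) / (F - 1 + n0), with n0 the number of measurements
    per individual (a coefficient close to the mean group size when
    the design is unbalanced).
    """
    denominator = f_ratio - 1.0 + n0
    if denominator == 0:
        raise ValueError("F - 1 + n0 must not be zero")
    return (f_ratio - 1.0) / denominator


# Table 1, belly brightness; n0 values are assumptions (see lead-in).
print(round(repeatability_from_f(6.0446, 3), 3))  # field
print(round(repeatability_from_f(1.4636, 2), 3))  # field-lab comparison
```

Because r is negative whenever F is below 1, the negative entries in the field-lab comparisons correspond to cases where variation between methods within a bird exceeded variation among birds.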






Downloaded from http://biorxiv.org/ on September 18, 2014



Figure 1: The avian tetrahedral colour space (from Endler and Mielke, 2005).

Figure 2: Reflectance spectra for belly, breast, throat and vent patches of male and female barn swallow (regression lines).

Figure 3: GLMM-derived repeatabilities (± 95% CI) in 2009-2010 plumage colouration measurements taken a) from live birds in the field, b) from feather samples in the lab, and c) across both procedures, both when including the whole light spectrum or only the human-visible spectrum in the analyses.